Instructor names and contact information

  1. Benjamin P. Berman
  2. Tiago C. Silva

Workshop Description

This workshop demonstrates how to perform ELMER analysis using matched RNA-seq and DNA methylation data.

You can find a detailed information about ELMER at http://bioconductor.org/packages/3.10/bioc/vignettes/ELMER/inst/doc/index.html.

Articles about ELMER:

Pre-requisites

  • Basic knowledge of R syntax
  • Familiarity with the SummarizedExperiment classes
  • Familiarity with ’omics data types including DNA methylation and gene expression

Workshop Participation

Students will have a chance to run ELMER analysis on a provided MultiAssayExperiment object created from TCGA data from GDC data portal.

Goals and objectives

  • gain familiarity ELMER input data, a MultiAssayExperiment object
  • Execute ELMER analysis on real data and understand its meaning

R/Bioconductor packages used

Retrieving the data

The RNA-seq data, DNA methylation data and patients metadata (i.e age, gender) used in this workshop is structured as a MultiAssayExperiment. You can read more about a MultiAssayExperiment object at http://bioconductor.org/packages/MultiAssayExperiment/ and https://bioconductor.github.io/BiocWorkshops/workflow-for-multi-omics-analysis-with-multiassayexperiment.html.

Through the next section we will:

  1. load the data
  2. Verify the RNA-seq data
  3. Verify the DNA methylation data
  4. Verify the samples metadata

Download the data

The data is available in this google drive.

Loading the data

## A MultiAssayExperiment object of 2 listed
##  experiments with user-defined names and respective classes. 
##  Containing an ExperimentList class object of length 2: 
##  [1] DNA methylation: RangedSummarizedExperiment with 148951 rows and 171 columns 
##  [2] Gene expression: RangedSummarizedExperiment with 56830 rows and 171 columns 
## Features: 
##  experiments() - obtain the ExperimentList instance 
##  colData() - the primary/phenotype DataFrame 
##  sampleMap() - the sample availability DataFrame 
##  `$`, `[`, `[[` - extract colData columns, subset, or experiment 
##  *Format() - convert into a long or wide DataFrame 
##  assays() - convert ExperimentList to a SimpleList of matrices

Verify the RNA-seq data

## ExperimentList class object of length 2: 
##  [1] DNA methylation: RangedSummarizedExperiment with 148951 rows and 171 columns 
##  [2] Gene expression: RangedSummarizedExperiment with 56830 rows and 171 columns
## [1] 56830
## [1] 171
## GRanges object with 56830 ranges and 3 metadata columns:
##                   seqnames              ranges strand | ensembl_gene_id
##                      <Rle>           <IRanges>  <Rle> |     <character>
##   ENSG00000000003     chrX 100627109-100639991      - | ENSG00000000003
##   ENSG00000000005     chrX 100584802-100599885      + | ENSG00000000005
##   ENSG00000000419    chr20   50934867-50958555      - | ENSG00000000419
##   ENSG00000000457     chr1 169849631-169894267      - | ENSG00000000457
##   ENSG00000000460     chr1 169662007-169854080      + | ENSG00000000460
##               ...      ...                 ...    ... .             ...
##   ENSG00000281904     chr2   90365737-90367699      + | ENSG00000281904
##   ENSG00000281909    chr15   22480439-22484840      - | ENSG00000281909
##   ENSG00000281910    chr16   58559796-58559931      - | ENSG00000281910
##   ENSG00000281912     chr1   45303910-45305619      + | ENSG00000281912
##   ENSG00000281920     chr2   65623272-65628424      + | ENSG00000281920
##                   external_gene_name original_ensembl_gene_id
##                          <character>              <character>
##   ENSG00000000003             TSPAN6       ENSG00000000003.13
##   ENSG00000000005               TNMD        ENSG00000000005.5
##   ENSG00000000419               DPM1       ENSG00000000419.11
##   ENSG00000000457              SCYL3       ENSG00000000457.12
##   ENSG00000000460           C1orf112       ENSG00000000460.15
##               ...                ...                      ...
##   ENSG00000281904         AC233263.6        ENSG00000281904.1
##   ENSG00000281909            HERC2P7        ENSG00000281909.1
##   ENSG00000281910           SNORA50A        ENSG00000281910.1
##   ENSG00000281912          LINC01144        ENSG00000281912.1
##   ENSG00000281920         AC007389.5        ENSG00000281920.1
##   -------
##   seqinfo: 24 sequences from an unspecified genome; no seqlengths
##                 TCGA-2H-A9GF-01A-11R-A37I-31 TCGA-2H-A9GG-01A-11R-A37I-31
## ENSG00000000003                     17.63165                    16.758267
## ENSG00000000005                      0.00000                     7.721927
## ENSG00000000419                     19.33125                    19.188030
## ENSG00000000457                     15.87565                    16.251205
##                 TCGA-2H-A9GH-01A-11R-A37I-31 TCGA-2H-A9GI-01A-11R-A37I-31
## ENSG00000000003                    17.624508                     17.89604
## ENSG00000000005                     9.236257                      0.00000
## ENSG00000000419                    19.765411                     20.90188
## ENSG00000000457                    15.745243                     15.92405

Verify the DNA methylation data

## ExperimentList class object of length 2: 
##  [1] DNA methylation: RangedSummarizedExperiment with 148951 rows and 171 columns 
##  [2] Gene expression: RangedSummarizedExperiment with 56830 rows and 171 columns
## [1] 148951
## [1] 171
## GRanges object with 148951 ranges and 4 metadata columns:
##              seqnames              ranges strand | address_A address_B
##                 <Rle>           <IRanges>  <Rle> | <integer> <integer>
##   cg08258224     chr1       864703-864704      - |  23712478      <NA>
##   cg13938959     chr1       898803-898804      + |  38601418      <NA>
##   cg12445832     chr1       898915-898916      - |  70726486      <NA>
##   cg23999112     chr1       898976-898977      - |  49747457      <NA>
##   cg11527153     chr1       902156-902157      + |  47662324      <NA>
##          ...      ...                 ...    ... .       ...       ...
##   cg16428758     chrX 155539968-155539969      - |  28623392      <NA>
##   cg05682970     chrX 155609878-155609879      - |  34723345      <NA>
##   cg07211220     chrX 155616254-155616255      + |  33714463      <NA>
##   cg25059696     chrY     7563149-7563150      + |  61613414      <NA>
##   cg13851368     chrY   11954167-11954168      + |  36601306  32634316
##                  channel  designType
##              <character> <character>
##   cg08258224        Both          II
##   cg13938959        Both          II
##   cg12445832        Both          II
##   cg23999112        Both          II
##   cg11527153        Both          II
##          ...         ...         ...
##   cg16428758        Both          II
##   cg05682970        Both          II
##   cg07211220        Both          II
##   cg25059696        Both          II
##   cg13851368         Grn           I
##   -------
##   seqinfo: 26 sequences from an unspecified genome; no seqlengths
##            TCGA-2H-A9GF-01A-11D-A37D-05 TCGA-2H-A9GG-01A-11D-A37D-05
## cg08258224                    0.9068315                    0.8888116
## cg13938959                    0.2791042                    0.2178040
## cg12445832                    0.1900486                    0.1202698
## cg23999112                    0.3198171                    0.1969021
##            TCGA-2H-A9GH-01A-11D-A37D-05 TCGA-2H-A9GI-01A-11D-A37D-05
## cg08258224                    0.8247016                    0.8174693
## cg13938959                    0.3241953                    0.3408468
## cg12445832                    0.2290422                    0.1741125
## cg23999112                    0.4268968                    0.4687823

Verify the samples metadata

## DataFrame with 4 rows and 4 columns
##                            sample      patient shortLetterCode
##                       <character>  <character>     <character>
## TCGA-2H-A9GF-01A TCGA-2H-A9GF-01A TCGA-2H-A9GF              TP
## TCGA-2H-A9GG-01A TCGA-2H-A9GG-01A TCGA-2H-A9GG              TP
## TCGA-2H-A9GH-01A TCGA-2H-A9GH-01A TCGA-2H-A9GH              TP
## TCGA-2H-A9GI-01A TCGA-2H-A9GI-01A TCGA-2H-A9GI              TP
##                           definition
##                          <character>
## TCGA-2H-A9GF-01A Primary solid Tumor
## TCGA-2H-A9GG-01A Primary solid Tumor
## TCGA-2H-A9GH-01A Primary solid Tumor
## TCGA-2H-A9GI-01A Primary solid Tumor
## [1] "Adenocarcinoma, NOS" "Adenocarcinoma, NOS" "Adenocarcinoma, NOS"
## [4] "Adenocarcinoma, NOS"
## 
##          Metastatic Primary solid Tumor Solid Tissue Normal 
##                   1                 161                   9
##                                             
##                                              Metastatic
##   Adenocarcinoma, NOS                                 0
##   Basaloid squamous cell carcinoma                    0
##   Mucinous adenocarcinoma                             0
##   Squamous cell carcinoma, keratinizing, NOS          0
##   Squamous cell carcinoma, NOS                        1
##   Tubular adenocarcinoma                              0
##                                             
##                                              Primary solid Tumor
##   Adenocarcinoma, NOS                                         78
##   Basaloid squamous cell carcinoma                             1
##   Mucinous adenocarcinoma                                      1
##   Squamous cell carcinoma, keratinizing, NOS                   4
##   Squamous cell carcinoma, NOS                                76
##   Tubular adenocarcinoma                                       1
##                                             
##                                              Solid Tissue Normal
##   Adenocarcinoma, NOS                                          7
##   Basaloid squamous cell carcinoma                             0
##   Mucinous adenocarcinoma                                      1
##   Squamous cell carcinoma, keratinizing, NOS                   0
##   Squamous cell carcinoma, NOS                                 1
##   Tubular adenocarcinoma                                       0

Selecting the samples

Since we will use only Primary solid Tumor samples we will remove the Metastatic and Solid Tissue Normal samples from our object

Performing ELMER analysis

We will first define some parameters used in the majority of ELMER function

Identifying hypo methylated distal probes between ESAD samples compared to ESCC samples

## Supervised mode will use all samples from boths groups. Setting argument minSubgroupFrac to 1
## ELMER will search for probes hypomethylated in group Adenocarcinoma, NOS (n:78) compared to Squamous cell carcinoma, NOS (n:76)
## ooo Arguments ooo
## o Number of probes: 148951
## o Beta value difference cut-off: 0.3
## o FDR cut-off: 0.01
## o Mode: supervised
## o % of samples per group in each comparison: 1
## o Min number of samples per group in each comparison: 5
## o Nb of samples group1 in each comparison: 78
## o Nb of samples group2 in each comparison: 76
## Output direction: analysis/primary_diagnosis-Adenocarcinoma__NOS_vs_Squamous_cell_carcinoma__NOS_supervised/hypo
## ooooooooooooooooo
## Progress disabled when using parallel plyr
## Saving results
## Number of relevant probes found: 1524
##                 probe       pvalue
## cg19860160 cg19860160 3.178629e-41
## cg02838683 cg02838683 7.322499e-46
## cg22512779 cg22512779 1.305816e-45
## cg17094363 cg17094363 4.500204e-35
## cg20429981 cg20429981 1.834374e-22
## cg01189072 cg01189072 9.596881e-18
##            Adenocarcinoma..NOS_Minus_Squamous.cell.carcinoma..NOS
## cg19860160                                             -0.4256266
## cg02838683                                             -0.3138876
## cg22512779                                             -0.3538253
## cg17094363                                             -0.4751664
## cg20429981                                             -0.3227621
## cg01189072                                             -0.3771010
##                adjust.p
## cg19860160 4.782424e-38
## cg02838683 2.423763e-42
## cg22512779 4.225151e-42
## cg17094363 2.376985e-32
## cg20429981 1.790510e-20
## cg01189072 5.028016e-16

Correlate RNA-seq expression and DNA methylation

Now that we have our hypomethylated probes we will look for the 20 nearest genes
to each probes to identify if any of these genes could being affected by the hypomethylation in the distal regulatory region.

## Searching for the 20 near genes
## Identifying gene position for each probe
## # A tibble: 6 x 5
## # Groups:   ID [1]
##   ID         GeneID          Symbol     Distance Side 
##   <chr>      <chr>           <chr>         <dbl> <chr>
## 1 cg00009553 ENSG00000261590 AC018554.2 -1462705 L10  
## 2 cg00009553 ENSG00000278499 AC018554.3 -1460699 L9   
## 3 cg00009553 ENSG00000261436 AC018554.1 -1377934 L8   
## 4 cg00009553 ENSG00000261310 AC009081.1 -1296929 L7   
## 5 cg00009553 ENSG00000259844 GNPATP     -1163530 L6   
## 6 cg00009553 ENSG00000261178 AC009169.1  -891638 L5

Now that we have our hypomethylated probes and the 20 nearest genes we want to identify if any of these genes could being affected by the hypomethylatio in the distal regulatory region.

## Using pre-defined groups. U (unmethylated): Adenocarcinoma, NOS, M (methylated): Squamous cell carcinoma, NOS
## -------------------
## * Filtering probes
## -------------------
## For more information see function preAssociationProbeFiltering
## Making sure we have at least 5% of beta values lesser than 0.3 and 5% of beta values greater 0.3.
## Removing 100701 probes out of 148951
## Calculating Pp (probe - gene) for all nearby genes
## Progress disabled when using parallel plyr
## File created: analysis/primary_diagnosis-Adenocarcinoma__NOS_vs_Squamous_cell_carcinoma__NOS_supervised/hypo/getPair.hypo.all.pairs.statistic.csv
##                                 Probe          GeneID Symbol Distance
## cg13100118.ENSG00000178828 cg13100118 ENSG00000178828 RNF186   345152
## cg19533443.ENSG00000173702 cg19533443 ENSG00000173702  MUC13   -38437
## cg14981432.ENSG00000124920 cg14981432 ENSG00000124920   MYRF        0
## cg26545245.ENSG00000166866 cg26545245 ENSG00000166866  MYO1A    40016
## cg17277896.ENSG00000143375 cg17277896 ENSG00000143375    CGN    37443
## cg21157873.ENSG00000143375 cg21157873 ENSG00000143375    CGN        0
##                            Sides        Raw.p          FDR
## cg13100118.ENSG00000178828    R9 4.334152e-27 3.345954e-24
## cg19533443.ENSG00000173702    L2 6.023782e-27 3.345954e-24
## cg14981432.ENSG00000124920    L1 7.607601e-27 3.345954e-24
## cg26545245.ENSG00000166866    R4 7.909052e-27 3.345954e-24
## cg17277896.ENSG00000143375    R1 8.886298e-27 3.345954e-24
## cg21157873.ENSG00000143375    L1 8.886298e-27 3.345954e-24

We can plot the correlation heatmap of DNA methylation and gene expression with the heatmapPairs function. We will plot only the first 100 pairs.

## Subsetting
## Ordering groups

Identifying TF enriched motifs on the hypomethylated regions

We have hypomethylated regions that has a distal gene upregulated. We want to identify which is the MR TF binding to those regions and regulating the distal gene. First we identify which are the enriched motifs for those hypomethylated regions using the function get.enriched.motif. This will output the enriched motifs and the respective hypomethylated probes with the motif signature around it (since a DNA methylation probes is 1bp, we extended the search window to \(\pm\) 250 bp).

## [1] "GATA4_HUMAN.H11MO.0.A" "GATA6_HUMAN.H11MO.0.A" "HNF4A_HUMAN.H11MO.0.A"
## [4] "GATA2_HUMAN.H11MO.1.A"
##   [1] "cg22050733" "cg04275506" "cg19860160" "cg22512779" "cg19626253"
##   [6] "cg03601886" "cg13784017" "cg11959316" "cg07191152" "cg25098793"
##  [11] "cg05625834" "cg17277896" "cg26787863" "cg00075975" "cg07005353"
##  [16] "cg00577847" "cg07117596" "cg14520947" "cg23486701" "cg22107147"
##  [21] "cg20043649" "cg08818866" "cg08847775" "cg12077754" "cg27239243"
##  [26] "cg10560561" "cg05977696" "cg23367358" "cg10018933" "cg21926118"
##  [31] "cg07613752" "cg01450522" "cg24590723" "cg03865925" "cg18342703"
##  [36] "cg21425296" "cg00584840" "cg23174148" "cg22539744" "cg01377807"
##  [41] "cg04319611" "cg11578055" "cg15634747" "cg23971987" "cg01923775"
##  [46] "cg05626226" "cg03464224" "cg00796302" "cg26728709" "cg08988420"
##  [51] "cg08830157" "cg18430895" "cg08609445" "cg16016281" "cg11821200"
##  [56] "cg24606701" "cg27351783" "cg21566177" "cg03172688" "cg10505610"
##  [61] "cg11584098" "cg19692192" "cg05165087" "cg07186962" "cg08015382"
##  [66] "cg27565517" "cg15094236" "cg12663302" "cg23381884" "cg14493742"
##  [71] "cg09305161" "cg10846969" "cg23226631" "cg10784258" "cg22443940"
##  [76] "cg19929194" "cg12964881" "cg24645679" "cg00084338" "cg17287122"
##  [81] "cg27246129" "cg09022230" "cg00772192" "cg02498579" "cg08828787"
##  [86] "cg14033170" "cg18100008" "cg09422239" "cg15059474" "cg24300607"
##  [91] "cg19935531" "cg14199261" "cg05921889" "cg05744395" "cg18991321"
##  [96] "cg00987534" "cg02901002" "cg12018521" "cg01824603" "cg10087556"
## [101] "cg14533298" "cg17291136" "cg04237288" "cg18336854" "cg12726213"
## [106] "cg24027477" "cg15235922" "cg07273304" "cg26895804" "cg10131026"
## [111] "cg14325112" "cg03489427" "cg10063407" "cg01364755" "cg14509153"
## [116] "cg10476206" "cg17568611" "cg23681599" "cg10494981" "cg25227803"
## [121] "cg27065896" "cg11688410" "cg17425818" "cg04337594" "cg18203466"
## [126] "cg10955669" "cg19754622" "cg13608733" "cg26725502" "cg16311302"
## [131] "cg15722533" "cg20482760" "cg23091777" "cg03078767" "cg02856190"
## [136] "cg25015613" "cg15110243" "cg18246298" "cg02159731" "cg06978117"
## [141] "cg23327896" "cg16182447" "cg26540315" "cg18248586" "cg03554573"
## [146] "cg21127079" "cg21672401" "cg21738443" "cg25757470" "cg17046586"
## [151] "cg19091779" "cg01579458" "cg10621809" "cg17066349" "cg27614751"
## [156] "cg26545245" "cg04874286" "cg11658060" "cg13712197" "cg17843665"
## [161] "cg20360395" "cg16378261" "cg12194594" "cg16536610" "cg12128696"
## [166] "cg13995599" "cg02909176" "cg16217297" "cg07800658" "cg15935770"
## [171] "cg23639055" "cg14850604" "cg21207694" "cg19708418" "cg13594138"
## [176] "cg21523688" "cg04094640" "cg09113939" "cg07092029" "cg07098502"
## [181] "cg10020890" "cg00599850" "cg13688786" "cg04726374" "cg04136369"
## [186] "cg08659204" "cg03828329" "cg04770088" "cg04603419" "cg17415265"
## [191] "cg00777445" "cg19078037" "cg00423871" "cg09926389" "cg18788725"
## [196] "cg07604565" "cg26162564" "cg27403406" "cg21211039" "cg01327767"
## [201] "cg13758960" "cg05321038" "cg00389552"
<<<<<<< HEAD
motif.result.file <- file.path(dir.out,"getMotif.hypo.motif.enrichment.csv")
=======

<<<<<<< HEAD

=======

>>>>>>> 736b28b463b1fdd25bcd2659c86c33517475f6c4
## TableGrob (2 x 2) "arrange": 2 grobs
##   z     cells    name           grob
## 1 1 (1-1,1-1) arrange gtable[layout]
## 2 2 (1-1,2-2) arrange gtable[layout]

Inferring MR TF binding to hypomethylated regions

Now that we have our enriched motifs, we have the following question: - Which is the candidate master regulator Transcription factor binding to those regions ?

It is important to highlight that since TF within the same family and subfamily have very close motifs, they will likely be indentified to bind the same regions. You can check that by verifying the intersection of probes from our enrichment analysis results.

## [1] 203
## [1] 189
## [1] 147

So more than 70% of probes are enriched for both GATA4 and GATA6. How could we infer which of those MR might be the one with higher impact ?

For that ELMER correlates the TF expression vs the average mean methylation of the hypomethylated probes with a motif signature and rank them. We expect that a TF with higher expression and lower methylation in the binding regions will be a better MR TF candidate than TF that are for example equally expressed on both groups.

## Using pre-defined groups. U (unmethylated): Adenocarcinoma, NOS, M (methylated): Squamous cell carcinoma, NOS
## -------------------------------------------------------------------------------------------------------------------
## ** Downloading TF list from Lambert, Samuel A., et al. The human transcription factors. Cell 172.4 (2018): 650-665.
## -------------------------------------------------------------------------------------------------------------------
## Accessing TF families from TFClass database to indentify known potential TF
## Retrieving TFClass family classification from ELMER.data.
## Retrieving TFClass subfamily classification from ELMER.data.
## Calculating the average methylation at all motif-adjacent probes
## Progress disabled when using parallel plyr
## Performing Mann-Whitney U test
## Progress disabled when using parallel plyr
## Finding potential TF and known potential TF
## Progress disabled when using parallel plyr
##                                       motif top.potential.TF.family
## GATA4_HUMAN.H11MO.0.A GATA4_HUMAN.H11MO.0.A                   GATA6
##                       top.potential.TF.subfamily potential.TF.family
## GATA4_HUMAN.H11MO.0.A                      GATA6         GATA6;GATA4
##                       potential.TF.subfamily
## GATA4_HUMAN.H11MO.0.A            GATA6;GATA4

Visualizing the results with TF ranking plot

ELMER provides an easy way to visualize the rank of all human TF evaluated with the function TF.rank.plot.

<<<<<<< HEAD

=======

>>>>>>> 736b28b463b1fdd25bcd2659c86c33517475f6c4

Summarizing several ELMER analysis

MR TF heatmap

When you perform several ELMER analysis with different arguments, you might want to summarize and visualize those several results. For that, we have the ELMER:::get.tabs function , which accepts as input the directories of the results and if you want to select the MR TF within the family of subfamily classification.

In the Data/analysis_april_2019 folder we provided 4 ELMER analysis. The first step is to retrieve the directory names with the results. Also, it is a golden rule when running several ELMER runs to add a understable name to the dir.out directories, such that you easily know which analysis were ran, with which arguments. In the example below, you can identify the tumor type, type of ELMER run (supervised/unsupervised), groups compared and direction of the analysis (hypo or hyper in group 1).

In the directory analysis_april_2019 we provide 4 results from different ELMER runs.

<<<<<<< HEAD
## [1] "Data/analysis_april_2019//COAD-READ/unsupervised/definition-Primary.solid.Tumor_vs_Solid.Tissue.Normal/hypo"                             
## [2] "Data/analysis_april_2019//ESCA/unsupervised/group-Adenocarcinoma.Primary.solid.Tumor_vs_Adenocarcinoma.Solid.Tissue.Normal/hypo"         
## [3] "Data/analysis_april_2019//ESCA/unsupervised/group-Adenocarcinoma.Primary.solid.Tumor_vs_Squamous.cell.carcinoma.Primary.solid.Tumor/hypo"
## [4] "Data/analysis_april_2019//PAAD/unsupervised/definition-Primary.solid.Tumor_vs_Solid.Tissue.Normal/hypo"

With those directory names we will use the function ELMER:::get.tabs to summarize the results.

=======
## [1] "analysis_april_2019//COAD-READ/unsupervised/definition-Primary.solid.Tumor_vs_Solid.Tissue.Normal/hypo"                             
## [2] "analysis_april_2019//ESCA/unsupervised/group-Adenocarcinoma.Primary.solid.Tumor_vs_Adenocarcinoma.Solid.Tissue.Normal/hypo"         
## [3] "analysis_april_2019//ESCA/unsupervised/group-Adenocarcinoma.Primary.solid.Tumor_vs_Squamous.cell.carcinoma.Primary.solid.Tumor/hypo"
## [4] "analysis_april_2019//PAAD/unsupervised/definition-Primary.solid.Tumor_vs_Solid.Tissue.Normal/hypo"

The funtion get.tabs will summarize the results in several tables

  1. tab: a binary matrix saying if the TF was found or not in the analysis
  2. tab.pval: a matrix with the MR TF p-value found in the analysis
  3. tab.or: a matrix with the highest motif OR for the MR TF found in the analysis
  4. tf.or.table: a matrix with 4 collums: MR TF, motif, FDR and analysis it was identified.
>>>>>>> 736b28b463b1fdd25bcd2659c86c33517475f6c4
## o Creating TF binary matrix
## Joining, by = "TF"
## Joining, by = "TF"
## Joining, by = "TF"
## o Creating TF FDR matrix
## Joining, by = "motif"
## Joining, by = "motif"
## Joining, by = "motif"
## Joining, by = "motif"
## o Creating TF OR matrix
## Joining, by = "motif"
## Joining, by = "motif"
## Joining, by = "motif"
## Joining, by = "motif"
## Joining, by = "motif"
## Joining, by = "motif"
## Joining, by = "motif"
## Joining, by = "motif"
## [1] "tab"         "tab.pval"    "tab.or"      "tf.or.table"

This function creates 4 matrices using some of the files in each analysis directory: 1. a binary matrix saying if the TF was found or not in the analysis: this function parses each getTF.hypo.significant.TFs.with.motif.summary.csv (create after you run the get.TFs function) file that has the MR TF for each enriched motif.

======= for (i in 1:3) colnames(tabs[[i]]) <- analysis.names # 1. a binary matrix saying if the TF was found or not in the analysis tab.binary <- tabs$tab head(tab.binary)
>>>>>>> 736b28b463b1fdd25bcd2659c86c33517475f6c4
##         COAD-READ (vs Normal) EAC (vs Normal) EAC (vs ESCC)
## ARID3A                      1               0             0
## FOXQ1                       1               0             0
## HNF1A                       1               1             1
## PDX1                        1               0             0
## POU5F1B                     1               0             0
## SOX9                        1               0             0
##         PAAD (vs Normal)
## ARID3A                 0
## FOXQ1                  0
## HNF1A                  1
## PDX1                   0
## POU5F1B                0
## SOX9                   1
<<<<<<< HEAD
  1. a matrix with the MR TF p-value found in the analysis: this uses getTF.hypo.significant.TFs.with.motif.summary.csv to identify the MR TF and getTF.hypo.TFs.with.motif.pvalue.rda, which is a matrix with TF as columns and enriched motifs as rows and the contains the p-value of the wilcoxon test of the TF expression for the Unmethylated vs Methylated groups.
======= >>>>>>> 736b28b463b1fdd25bcd2659c86c33517475f6c4
##         COAD-READ (vs Normal) EAC (vs Normal) EAC (vs ESCC)
## ARID3A           8.397450e-08              NA            NA
## FOXQ1            3.774568e-08              NA            NA
## HNF1A            5.845199e-10     0.007700548  1.689635e-09
## PDX1             3.243946e-09              NA            NA
## POU5F1B          1.259413e-10              NA            NA
## SOX9             8.410768e-08              NA            NA
##         PAAD (vs Normal)
## ARID3A                NA
## FOXQ1                 NA
## HNF1A       1.777426e-07
## PDX1                  NA
## POU5F1B               NA
## SOX9        2.017003e-07
<<<<<<< HEAD
  1. a matrix with the highest motif OR for the MR TF found in the analysis: this function uses the file getTF.hypo.significant.TFs.with.motif.summary.csv to get the MR TF and then uses getMotif.hypo.motif.enrichment.csv to get the motifs OR to which the MR TF is known to bind and returns the highest OR.
======= >>>>>>> 736b28b463b1fdd25bcd2659c86c33517475f6c4
##         COAD-READ (vs Normal) EAC (vs Normal) EAC (vs ESCC)
## ARID3A               1.348007              NA            NA
## FOXQ1                1.362852              NA            NA
## HNF1A                1.471907        1.501869      2.190133
## PDX1                 1.730954              NA            NA
## POU5F1B              1.786492              NA            NA
## SOX9                 1.689930              NA            NA
##         PAAD (vs Normal)
## ARID3A                NA
## FOXQ1                 NA
## HNF1A           1.960243
## PDX1                  NA
## POU5F1B               NA
## SOX9            1.684580
<<<<<<< HEAD
  1. a matrix with 4 collums: MR TF, motif, FDR and analysis it was identified: This function uses the same files as the previous matrices, but returns the results in a different format.
======= >>>>>>> 736b28b463b1fdd25bcd2659c86c33517475f6c4
##         TF                 motif          FDR              analysis
## 7   ZNF263  ZIM3_HUMAN.H11MO.0.C 7.120537e-11 COAD-READ (vs Normal)
## 5  POU5F1B P5F1B_HUMAN.H11MO.0.D 1.259413e-10 COAD-READ (vs Normal)
## 8  ZSCAN16 ZNF85_HUMAN.H11MO.0.C 3.818943e-10 COAD-READ (vs Normal)
## 3    HNF1A HNF1B_HUMAN.H11MO.1.A 5.845199e-10 COAD-READ (vs Normal)
## 17   HNF1A HNF1B_HUMAN.H11MO.1.A 1.689635e-09         EAC (vs ESCC)
## 21   FOXA3 FOXA1_HUMAN.H11MO.0.A 1.806201e-09         EAC (vs ESCC)
<<<<<<< HEAD

Creating Summary plots

With those summarized results we can create two summary plots:

  1. A heatmap with the MR TFs identified in each analysis.
  2. A scatter plot of the TF expression vs the avg DNA methylation of the TFBS.

MR TF heatmap

The code below creates the heatmap using complexHeatmap package for the binary MR TF matrix and the p-value one.

library(ComplexHeatmap)
library(vegan)

col <- ifelse(classification == "family","top.potential.TF.family", "top.potential.TF.subfamily")

analysis.colors <- c(
  "COAD-READ (vs Normal)" = '#ff964f',
  "EAC (vs Normal)" = '#94e894',
  "EAC (vs ESCC)" = '#eaadb9',
  "PAAD (vs Normal)" = '#9ab9f9'                   
)

# get top TFs in each anaylsis
labels <- ELMER:::get.top.tf.by.pval(tab.pval,top = nrow(tab.pval))

hb = HeatmapAnnotation(df = data.frame("Mode" = rep("unsupervised",4),
                                       "Analysis" = analysis.names),
                       col = list("Analysis" = analysis.colors,
                                  "Mode" = c("unsupervised" = "#ebd540",
                                             "supervised" = "#718da5")),
                       show_annotation_name = FALSE,
                       show_legend = T,
                       annotation_name_side = "left",
                       annotation_name_gp = gpar(fontsize = 12))

# Binary heatmap
ht_list_binary <- Heatmap(as.matrix(tab.binary),
                   name = "Binary heatmap" ,
                   col = c("1" = "black", "0" = "white"),
                   column_names_gp = gpar(fontsize = 8),
                   show_column_names = FALSE,
                   use_raster = TRUE,
                   na_col = "white",
                   raster_device = c("png"),
                   raster_quality = 2,
                   show_row_names = F,
                   show_heatmap_legend = TRUE,
                   cluster_columns = FALSE,
                   cluster_rows = TRUE,
                   show_column_dend = FALSE,
                   show_row_dend = FALSE,
                   clustering_distance_rows = function(x) {
                     vegan::vegdist(tab.binary,method = "jaccard",binary = TRUE)
                   },
                   clustering_method_rows = "average",
                   top_annotation = hb,
                   width = unit(5, "cm"),
                   row_names_gp = gpar(fontsize = 1),
                   column_title = paste0("Binrary MR TF Heatmap"), 
                   column_title_gp = gpar(fontsize = 10), 
                   row_title_gp = gpar(fontsize = 16)) 

ht_list_binary <-  ht_list_binary + 
  rowAnnotation(link = anno_mark(at = match(labels,rownames(tab.pval)), 
                                 labels = labels,labels_gp = gpar(fontsize = 10)),
                width = unit(1, "cm") + max_text_width(labels)
  ) 
draw(ht_list_binary,
     newpage = FALSE, 
     column_title_gp = gpar(fontsize = 12, fontface = "bold"),
     heatmap_legend_side = "left",
     show_heatmap_legend = TRUE,
     annotation_legend_side = "right")

MR TF expression vs DNA methylation at binding motifs

The main difference from the last plot is the requirement of the MAE object with the DNA methylation and gene expression data. For a given MR TF, we need to indentify the top enriched motif it binds and then for each enriched motifs its distal paired differently methylated probes using the getMotif.hypo.enriched.motifs.rda object within the results folder.

Also, since we only provide one object used for two analysis (“EAC (vs Normal)” and “EAC (vs ESCC)” ) we will only plot the MR TF expression vs DNA methylation at binding motifs for them.

analsysis.dir.esca <- unique(dirname(dir("Data/analysis_april_2019/ESCA/",recursive = T,full.names = T)))
mae <- readRDS("Data/TCGA_ESCA_MAE_distal_regions.rds")

tab.pval.esca <- tab.pval[,2:3]

suppressMessages({
library(ELMER)
library(grid)
library(ggplot2)
library(ggpubr)
library(dplyr)
})


analysis.colors <- c(
  "COAD-READ (vs Normal)" = '#ff964f',
  "EAC (vs Normal)" = '#94e894',
  "EAC (vs ESCC)" = '#eaadb9',
  "PAAD (vs Normal)" = '#9ab9f9'                   
)

# Top has the number of MRTF to plot
for (top in c(5)) {
  
  scatter.list <- list()
  scatter.grid <- list()
  scatter.labels <- c()
  
  for (path in analsysis.dir.esca) {
    # Load of object with enriched motifs and the probes mapped to it
    load(dir(path = path, 
             pattern = "getMotif.hypo.enriched.motifs.rda",
             recursive = T,full.names = T)    
    )
    i <- which(path == analsysis.dir.esca)
    
    tumor <- basename(dirname(dirname(dirname(path))))
    mae.plot <- mae
      
     # Change the data to plot based on the analysis performed  
    if (grepl("Squamous",path)) {
      category <- "primary_diagnosis"
      # keep only samples used in the analysis
      mae.plot <- mae.plot[,mae.plot$primary_diagnosis %in% c("Adenocarcinoma, NOS", 
                                                              "Squamous cell carcinoma, NOS") & 
                             mae.plot$definition %in% c("Primary solid Tumor")]          
    } else {
      category <- "definition"
      # keep only samples used in the analysis
      mae.plot <- mae.plot[,mae.plot$primary_diagnosis %in% c("Adenocarcinoma, NOS") & 
                             mae.plot$definition %in% c("Primary solid Tumor","Solid Tissue Normal")]
    }
    # get top Tfs
    topTFs <- rownames(tab.pval)[head(sort(DelayedArray::rowMins(tab.pval.esca[,i,drop = F],na.rm = T), 
                                           index.return = TRUE, decreasing = F)$ix, n = top)]
    topTFs <- topTFs[1:top]
    for (j in topTFs) {
      TF <- j
      scatter.labels <- c(scatter.labels,colnames(tab.pval.esca)[i])
      
      motif <- tf.or.table[tf.or.table$TF == j & 
                             tf.or.table$analysis == colnames(tab.pval.esca)[i],"motif"][1]
      
      # Change the dots color based on the analysis performed  
      if (grepl("Squamous",path))   color.value <- c("#FF0000","#0000FF")
      if (grepl("Adenocarcinoma.Solid.Tissue.Normal",path)) color.value <- c("#FF0000","#176d55")
      
      if (is.na(motif)) next
        
        # Plot TF expression vs avg DNA methylation of the TFBS
        suppressMessages({
          scatter <- scatter.plot(data = mae.plot,
                              correlation = TRUE,
                              byTF = list(TF = TF,
                                          probe = enriched.motif[[motif]]), 
                              category = category,
                              ylim = c(0,25), 
                              dots.size = 0.8,
                              color.value = color.value,
                              save = FALSE, 
                              lm_line = TRUE) +
            ylab(bquote(.(TF) ~ log[2](RSEM + 1))) + 
            xlab(paste0("Avg. DNA met. at \n", length(enriched.motif[[motif]]),
                    " CpGs w/ \n enriched motif ",
                    sub("_HUMAN.H11MO.[0-1].[A-D]", "",  motif))) + 
            theme(legend.position = "none", 
                  plot.title = element_blank(),
                  axis.title.y = element_text(size = 8, face = "bold"),
                  axis.title.x = element_text(size = 8, face = "bold"),
                  strip.background = element_rect(fill = analysis.colors[i + 1]),
                  panel.grid.major = element_blank(),
                  panel.grid.minor = element_blank(),
                  # Remove panel background
                  panel.background = element_blank()
        )
      })
      
      # If this is the first plot of the analysis we will add a labels (rectangle + text)  
      if(j == topTFs[1]){
        gt <- gtable::gtable(widths = unit(0.5,"cm"), heights = unit(3, "in")) %>%
          gtable::gtable_add_grob(grid::rectGrob(gp=gpar(fill= analysis.colors[i + 1])), 1, 1, name = 1) %>% 
          gtable::gtable_add_grob(grid::textGrob(colnames(tab.pval.esca)[i], 
                                                 rot = 90, 
                                                 gp = grid::gpar(fontsize = 8)), 1, 1, name = 2)
        scatter <-  scatter %>% ggpubr::annotate_figure(left = gt)
      }
      scatter.list[[paste0(i,j)]] <-  scatter
      
    }
    scatter.grid[[i]] <-  ggpubr::ggarrange(plotlist = scatter.list[grep(paste0("^",i),names(scatter.list))], 
                                            ncol = top,
                                            # the first column should be wider since it has the label
                                            widths = c(1.15,rep(1,top - 1)), 
                                            nrow = 1, 
                                            common.legend = T,
                                            legend = "bottom")
  }     
}
options(repr.plot.width=20, repr.plot.height=8)

# we use ggarrange to arrange the plots together
ggpubr::ggarrange(plotlist = scatter.grid, 
                  ncol = 1,
                  # the first column should be wider since it has the label
                  widths = c(1.15,rep(1,top - 1)), 
                  nrow = 2, # Number of analysis - rows of the plot
                  heights = c(rep(1.5,length(scatter.list) - 1), 2), 
                  common.legend = F,
                  legend = "bottom")

=======
library(ComplexHeatmap)
library(vegan)

col <- ifelse(classification == "family","top.potential.TF.family", "top.potential.TF.subfamily")

analysis.colors <- c(
  "COAD-READ (vs Normal)" = '#ff964f',
  "EAC (vs Normal)" = '#94e894',
  "EAC (vs ESCC)" = '#eaadb9',
  "PAAD (vs Normal)" = '#9ab9f9'                   
)

# get top TFs in each anaylsis
labels <- ELMER:::get.top.tf.by.pval(tab.pval,top = nrow(tab.pval))

hb = HeatmapAnnotation(df = data.frame("Mode" = rep("unsupervised",4),
                                       "Analysis" = analysis.names),
                       col = list("Analysis" = analysis.colors,
                                  "Mode" = c("unsupervised" = "#ebd540",
                                             "supervised" = "#718da5")),
                       show_annotation_name = FALSE,
                       show_legend = T,
                       annotation_name_side = "left",
                       annotation_name_gp = gpar(fontsize = 12))

# Binary heatmap
ht_list <- Heatmap(as.matrix(tab.binary),
                   name = "Binary heatmap" ,
                   col = c("1" = "black", "0" = "white"),
                   column_names_gp = gpar(fontsize = 8),
                   show_column_names = FALSE,
                   use_raster = TRUE,
                   na_col = "white",
                   raster_device = c("png"),
                   raster_quality = 2,
                   show_row_names = F,
                   show_heatmap_legend = TRUE,
                   cluster_columns = FALSE,
                   cluster_rows = TRUE,
                   clustering_distance_rows = function(x) {
                     vegan::vegdist(tab.binary,method = "jaccard",binary = TRUE)
                   },
                   clustering_method_rows = "average",
                   top_annotation = hb,
                   width = unit(5, "cm"),
                   row_names_gp = gpar(fontsize = 1),
                   column_title = paste0("Binrary MR TF Heatmap"), 
                   column_title_gp = gpar(fontsize = 10), 
                   row_title_gp = gpar(fontsize = 16)) 

ht_list <-  ht_list + 
  rowAnnotation(link = anno_mark(at = match(labels,rownames(tab.pval)), 
                                 labels = labels,labels_gp = gpar(fontsize = 10)),
                width = unit(1, "cm") + max_text_width(labels)
  ) 
draw(ht_list,
     newpage = FALSE, 
     column_title_gp = gpar(fontsize = 12, fontface = "bold"),
     heatmap_legend_side = "right",
     show_heatmap_legend = TRUE,
     annotation_legend_side = "right")

MR TF expression vs DNA methylation at binding motifs

Since we only provide one object used for two analysis (“EAC (vs Normal)” and “EAC (vs ESCC)” ) we will only plot the MR TF expression vs DNA methylation at binding motifs for them.

analsysis.dir.esca <- unique(dirname(dir("analysis_april_2019/ESCA/",recursive = T,full.names = T)))
mae <- readRDS("TCGA_ESCA_MAE_distal_regions.rds")

tab.pval.esca <- tab.pval[,2:3]

library(ELMER)
library(grid)
library(ggplot2)
library(ggpubr)
library(dplyr)

for (top in c(5)) {
  
  scatter.list <- list()
  scatter.grid <- list()
  scatter.labels <- c()
  
  for (path in analsysis.dir.esca) {
    # Load of object with enriched motifs
    load(dir(path = path, 
             pattern = "getMotif.hypo.enriched.motifs.rda",
             recursive = T,full.names = T)    
    )
    i <- which(path == analsysis.dir.esca)
    
    tumor <- basename(dirname(dirname(dirname(path))))
    mae.plot <- mae
    if (grepl("Squamous",path)) {
      category <- "primary_diagnosis"
      # keep only samples used in the analysis
      mae.plot <- mae.plot[,mae.plot$primary_diagnosis %in% c("Adenocarcinoma, NOS", 
                                                              "Squamous cell carcinoma, NOS") & 
                             mae.plot$definition %in% c("Primary solid Tumor")]          
    } else {
      category <- "definition"
      # keep only samples used in the analysis
      mae.plot <- mae.plot[,mae.plot$primary_diagnosis %in% c("Adenocarcinoma, NOS") & 
                             mae.plot$definition %in% c("Primary solid Tumor","Solid Tissue Normal")]
    }
    # get top Tfs
    topTFs <- rownames(tab.pval)[head(sort(DelayedArray::rowMins(tab.pval.esca[,i,drop = F],na.rm = T), 
                                           index.return = TRUE, decreasing = F)$ix, n = top)]
    topTFs <- topTFs[1:top]
    for (j in topTFs) {
      TF <- j
      scatter.labels <- c(scatter.labels,colnames(tab.pval.esca)[i])
      
      motif <- tf.or.table[tf.or.table$TF == j & 
                             tf.or.table$analysis == colnames(tab.pval.esca)[i],"motif"][1]
      
      if (grepl("Squamous",path))   color.value <- c("#FF0000","#0000FF")
      if (grepl("Adenocarcinoma.Solid.Tissue.Normal",path)) color.value <- c("#FF0000","#176d55")
      
      if (is.na(motif)) next
      scatter <- scatter.plot(data = mae.plot,
                              correlation = TRUE,
                              byTF = list(TF = TF,
                                          probe = enriched.motif[[motif]]), 
                              category = category,
                              ylim = c(0,25), 
                              dots.size = 0.8,
                              color.value = color.value,
                              save = FALSE, 
                              lm_line = TRUE) +
        ylab(bquote(.(TF) ~ log[2](RSEM + 1))) + 
        xlab(paste0("Avg. DNA met. at \n", length(enriched.motif[[motif]]),
                    " CpGs w/ \n enriched motif ",
                    sub("_HUMAN.H11MO.[0-1].[A-D]", "",  motif))
        ) + theme(legend.position = "none", 
                  plot.title = element_blank(),
                  axis.title.y = element_text(size = 8, face = "bold"),
                  axis.title.x = element_text(size = 8, face = "bold"),
                  strip.background = element_rect(fill = analysis.colors[i + 1]),
                  panel.grid.major = element_blank(),
                  panel.grid.minor = element_blank(),
                  # Remove panel background
                  panel.background = element_blank()
        )
      
      if(j == topTFs[1]){
        gt <- gtable::gtable(widths = unit(0.5,"cm"), heights = unit(3, "in")) %>%
          gtable::gtable_add_grob(grid::rectGrob(gp=gpar(fill= analysis.colors[i + 1])), 1, 1, name = 1) %>% 
          gtable::gtable_add_grob(grid::textGrob(colnames(tab.pval.esca)[i], 
                                                 rot = 90, 
                                                 gp = grid::gpar(fontsize = 8)), 1, 1, name = 2)
        scatter <-  scatter %>% ggpubr::annotate_figure(left = gt)
      }
      scatter.list[[paste0(i,j)]] <-  scatter
      
    }
    scatter.grid[[i]] <-  ggpubr::ggarrange(plotlist = scatter.list[grep(paste0("^",i),names(scatter.list))], 
                                            ncol = top,
                                            # the first column should be wider since it has the label
                                            widths = c(1.15,rep(1,top - 1)), 
                                            nrow = 1, 
                                            common.legend = T,
                                            legend = "bottom")
  }     
}
ggpubr::ggarrange(plotlist = scatter.grid, 
                  ncol = 1,
                  # the first column should be wider since it has the label
                  widths = c(1.15,rep(1,top - 1)), 
                  nrow = 2, # Number of analysis
                  heights = c(rep(1.5,length(scatter.list) - 1), 2), 
                  #labels="AUTO", # This will add A B C D etc. to the plot
                  common.legend = F,
                  legend = "bottom")

>>>>>>> 736b28b463b1fdd25bcd2659c86c33517475f6c4

Code to create the data used in the workshop

The function getTCGA uses TCGAbiolinks to retrieve TCGA data in ELMER accept format.

Session information

<<<<<<< HEAD ======= >>>>>>> 736b28b463b1fdd25bcd2659c86c33517475f6c4
## R version 3.6.0 (2019-04-26)
## Platform: x86_64-apple-darwin15.6.0 (64-bit)
## Running under: macOS Mojave 10.14.5
## 
## Matrix products: default
## BLAS:   /Library/Frameworks/R.framework/Versions/3.6/Resources/lib/libRblas.0.dylib
## LAPACK: /Library/Frameworks/R.framework/Versions/3.6/Resources/lib/libRlapack.dylib
## 
## locale:
## [1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8
## 
## attached base packages:
##  [1] grid      parallel  stats4    stats     graphics  grDevices utils    
##  [8] datasets  methods   base     
## 
## other attached packages:
##  [1] dplyr_0.8.3                 ggpubr_0.2.1               
##  [3] magrittr_1.5                ggplot2_3.2.0              
##  [5] vegan_2.5-5                 lattice_0.20-38            
##  [7] permute_0.9-5               ComplexHeatmap_2.0.0       
##  [9] MultiAssayExperiment_1.10.4 SummarizedExperiment_1.14.0
## [11] DelayedArray_0.10.0         BiocParallel_1.18.0        
## [13] matrixStats_0.54.0          Biobase_2.44.0             
## [15] GenomicRanges_1.36.0        GenomeInfoDb_1.20.0        
## [17] IRanges_2.18.1              S4Vectors_0.22.0           
## [19] BiocGenerics_0.30.0         ELMER_2.9.2                
## [21] ELMER.data_2.9.3           
## 
## loaded via a namespace (and not attached):
<<<<<<< HEAD
##   [1] utf8_1.1.4                  R.utils_2.9.0              
##   [3] tidyselect_0.2.5            RSQLite_2.1.1              
##   [5] AnnotationDbi_1.46.0        htmlwidgets_1.3            
##   [7] DESeq_1.36.0                munsell_0.5.0              
##   [9] codetools_0.2-16            withr_2.1.2                
##  [11] colorspace_1.4-1            knitr_1.23                 
##  [13] rstudioapi_0.10             ggsignif_0.5.0             
##  [15] labeling_0.3                GenomeInfoDbData_1.2.1     
##  [17] hwriter_1.3.2               KMsurv_0.1-5               
##  [19] bit64_0.9-7                 downloader_0.4             
##  [21] vctrs_0.2.0                 generics_0.0.2             
##  [23] xfun_0.8                    biovizBase_1.32.0          
##  [25] ggthemes_4.2.0              BiocFileCache_1.8.0        
##  [27] EDASeq_2.18.0               R6_2.4.0                   
##  [29] doParallel_1.0.14           clue_0.3-57                
##  [31] locfit_1.5-9.1              AnnotationFilter_1.8.0     
##  [33] bitops_1.0-6                reshape_0.8.8              
##  [35] assertthat_0.2.1            scales_1.0.0               
##  [37] nnet_7.3-12                 gtable_0.3.0               
##  [39] sva_3.32.1                  ensembldb_2.8.0            
##  [41] rlang_0.4.0                 zeallot_0.1.0              
##  [43] genefilter_1.66.0           cmprsk_2.2-8               
##  [45] GlobalOptions_0.1.0         splines_3.6.0              
##  [47] rtracklayer_1.44.0          lazyeval_0.2.2             
##  [49] acepack_1.4.1               dichromat_2.0-0            
##  [51] selectr_0.4-1               broom_0.5.2                
##  [53] checkmate_1.9.4             yaml_2.2.0                 
##  [55] GenomicFeatures_1.36.4      backports_1.1.4            
##  [57] Hmisc_4.2-0                 tools_3.6.0                
##  [59] RColorBrewer_1.1-2          Rcpp_1.0.1                 
##  [61] plyr_1.8.4                  base64enc_0.1-3            
##  [63] progress_1.2.2              zlibbioc_1.30.0            
##  [65] purrr_0.3.2                 RCurl_1.95-4.12            
##  [67] prettyunits_1.0.2           rpart_4.1-15               
##  [69] openssl_1.4.1               GetoptLong_0.1.7           
##  [71] cowplot_1.0.0               zoo_1.8-6                  
##  [73] ggrepel_0.8.1               cluster_2.1.0              
##  [75] data.table_1.12.2           circlize_0.4.6             
##  [77] survminer_0.4.4             ProtGenerics_1.16.0        
##  [79] aroma.light_3.14.0          hms_0.5.0                  
##  [81] evaluate_0.14               xtable_1.8-4               
##  [83] XML_3.98-1.20               gridExtra_2.3              
##  [85] shape_1.4.4                 compiler_3.6.0             
##  [87] biomaRt_2.41.7              tibble_2.1.3               
##  [89] crayon_1.3.4                R.oo_1.22.0                
##  [91] htmltools_0.3.6             mgcv_1.8-28                
##  [93] Formula_1.2-3               tidyr_0.8.3                
##  [95] geneplotter_1.62.0          DBI_1.0.0                  
##  [97] dbplyr_1.4.2                matlab_1.0.2               
##  [99] MASS_7.3-51.4               rappdirs_0.3.1             
## [101] ShortRead_1.42.0            Matrix_1.2-17              
## [103] readr_1.3.1                 cli_1.1.0                  
## [105] R.methodsS3_1.7.1           Gviz_1.28.0                
## [107] pkgconfig_2.0.2             km.ci_0.5-2                
## [109] prettydoc_0.2.1             GenomicAlignments_1.20.1   
## [111] foreign_0.8-71              plotly_4.9.0               
## [113] xml2_1.2.0                  foreach_1.4.4              
## [115] annotate_1.62.0             XVector_0.24.0             
## [117] rvest_0.3.4                 stringr_1.4.0              
## [119] VariantAnnotation_1.30.1    digest_0.6.20              
## [121] ConsensusClusterPlus_1.48.0 Biostrings_2.52.0          
## [123] rmarkdown_1.13              TCGAbiolinks_2.13.3        
## [125] survMisc_0.5.5              htmlTable_1.13.1           
## [127] edgeR_3.26.5                curl_3.3                   
## [129] Rsamtools_2.0.0             rjson_0.2.20               
## [131] nlme_3.1-140                jsonlite_1.6               
## [133] viridisLite_0.3.0           askpass_1.1                
## [135] limma_3.40.2                BSgenome_1.52.0            
## [137] fansi_0.4.0                 pillar_1.4.2               
## [139] httr_1.4.0                  survival_2.44-1.1          
## [141] glue_1.3.1                  png_0.1-7                  
## [143] iterators_1.0.10            bit_1.1-14                 
## [145] stringi_1.4.3               blob_1.2.0                 
## [147] latticeExtra_0.6-28         memoise_1.1.0

Workshop materials

Workshops HTMLs

Workshop videos

We have a set of recorded videos, explaining some of the workshops.

======= ## [1] backports_1.1.4 circlize_0.4.6 ## [3] Hmisc_4.2-0 aroma.light_3.14.0 ## [5] plyr_1.8.4 selectr_0.4-1 ## [7] ConsensusClusterPlus_1.48.0 lazyeval_0.2.2 ## [9] splines_3.6.0 sva_3.32.1 ## [11] digest_0.6.19 foreach_1.4.4 ## [13] ensembldb_2.8.0 htmltools_0.3.6 ## [15] fansi_0.4.0 checkmate_1.9.3 ## [17] memoise_1.1.0 BSgenome_1.52.0 ## [19] cluster_2.0.9 doParallel_1.0.14 ## [21] limma_3.40.2 Biostrings_2.52.0 ## [23] readr_1.3.1 annotate_1.62.0 ## [25] R.utils_2.8.0 prettyunits_1.0.2 ## [27] colorspace_1.4-1 blob_1.1.1 ## [29] rvest_0.3.4 ggrepel_0.8.1 ## [31] xfun_0.7 crayon_1.3.4 ## [33] RCurl_1.95-4.12 jsonlite_1.6 ## [35] genefilter_1.66.0 zeallot_0.1.0 ## [37] zoo_1.8-6 survival_2.44-1.1 ## [39] VariantAnnotation_1.30.1 iterators_1.0.10 ## [41] glue_1.3.1 survminer_0.4.4 ## [43] gtable_0.3.0 zlibbioc_1.30.0 ## [45] XVector_0.24.0 GetoptLong_0.1.7 ## [47] shape_1.4.4 scales_1.0.0 ## [49] DESeq_1.36.0 DBI_1.0.0 ## [51] edgeR_3.26.4 ggthemes_4.2.0 ## [53] Rcpp_1.0.1 viridisLite_0.3.0 ## [55] cmprsk_2.2-8 xtable_1.8-4 ## [57] progress_1.2.2 htmlTable_1.13.1 ## [59] clue_0.3-57 foreign_0.8-71 ## [61] bit_1.1-14 matlab_1.0.2 ## [63] km.ci_0.5-2 Formula_1.2-3 ## [65] htmlwidgets_1.3 httr_1.4.0 ## [67] RColorBrewer_1.1-2 acepack_1.4.1 ## [69] reshape_0.8.8 pkgconfig_2.0.2 ## [71] XML_3.98-1.20 R.methodsS3_1.7.1 ## [73] Gviz_1.28.0 nnet_7.3-12 ## [75] utf8_1.1.4 locfit_1.5-9.1 ## [77] labeling_0.3 tidyselect_0.2.5 ## [79] rlang_0.3.4 AnnotationDbi_1.46.0 ## [81] munsell_0.5.0 tools_3.6.0 ## [83] cli_1.1.0 downloader_0.4 ## [85] generics_0.0.2 RSQLite_2.1.1 ## [87] broom_0.5.2 evaluate_0.14 ## [89] stringr_1.4.0 yaml_2.2.0 ## [91] knitr_1.23 bit64_0.9-7 ## [93] survMisc_0.5.5 purrr_0.3.2 ## [95] AnnotationFilter_1.8.0 TCGAbiolinks_2.13.1 ## [97] nlme_3.1-140 EDASeq_2.18.0 ## [99] R.oo_1.22.0 xml2_1.2.0 ## [101] biomaRt_2.41.3 compiler_3.6.0 ## [103] rstudioapi_0.10 plotly_4.9.0 ## [105] curl_3.3 png_0.1-7 ## [107] tibble_2.1.3 geneplotter_1.62.0 ## [109] stringi_1.4.3 GenomicFeatures_1.36.1 ## [111] ProtGenerics_1.16.0 Matrix_1.2-17 ## [113] vctrs_0.1.0 KMsurv_0.1-5 ## [115] pillar_1.4.1 GlobalOptions_0.1.0 ## [117] cowplot_0.9.4 data.table_1.12.2 ## [119] bitops_1.0-6 rtracklayer_1.44.0 ## [121] R6_2.4.0 latticeExtra_0.6-28 ## [123] hwriter_1.3.2 ShortRead_1.42.0 ## [125] gridExtra_2.3 codetools_0.2-16 ## [127] dichromat_2.0-0 MASS_7.3-51.4 ## [129] assertthat_0.2.1 rjson_0.2.20 ## [131] withr_2.1.2 GenomicAlignments_1.20.0 ## [133] Rsamtools_2.0.0 GenomeInfoDbData_1.2.1 ## [135] mgcv_1.8-28 hms_0.4.2 ## [137] prettydoc_0.2.1 rpart_4.1-15 ## [139] tidyr_0.8.3 rmarkdown_1.13 ## [141] biovizBase_1.32.0 base64enc_0.1-3 >>>>>>> 736b28b463b1fdd25bcd2659c86c33517475f6c4